Regret Minimization in Non-Zero-Sum Games with Applications to Building Champion Multiplayer Computer Poker Agents
نویسنده
چکیده
In two-player zero-sum games, if both players minimize their average external regret, then the average of the strategy profiles converges to a Nash equilibrium. For n-player general-sum games, however, theoretical guarantees for regret minimization are less understood. Nonetheless, Counterfactual Regret Minimization (CFR), a popular regret minimization algorithm for extensiveform games, has generated winning three-player Texas Hold’em agents in the Annual Computer Poker Competition (ACPC). In this paper, we provide the first set of theoretical properties for regret minimization algorithms in non-zero-sum games by proving that solutions eliminate iterative strict domination. We formally define dominated actions in extensive-form games, show that CFR avoids iteratively strictly dominated actions and strategies, and demonstrate that removing iteratively dominated actions is enough to win a mock tournament in a small poker game. In addition, for two-player non-zero-sum games, we bound the worst case performance and show that in practice, regret minimization can yield strategies very close to equilibrium. Our theoretical advancements lead us to a new modification of CFR for games with more than two players that is more efficient and may be used to generate stronger strategies than previously possible. Furthermore, we present a new three-player Texas Hold’em poker agent that was built using CFR and a novel game decomposition method. Our new agent wins the three-player events of the 2012 ACPC and defeats the winning three-player Email address: [email protected] (Richard Gibson) URL: http://cs.ualberta.ca/~rggibson/ (Richard Gibson) Preprint submitted to Artificial Intelligence May 2, 2013 ar X iv :1 30 5. 00 34 v1 [ cs .G T ] 3 0 A pr 2 01 3 programs from previous competitions while requiring less resources to generate than the 2011 winner. Finally, we show that our CFR modification computes a strategy of equal quality to our new agent in a quarter of the time of standard CFR using half the memory.
منابع مشابه
Using counterfactual regret minimization to create competitive multiplayer poker agents
Games are used to evaluate and advance Multiagent and Artificial Intelligence techniques. Most of these games are deterministic with perfect information (e.g. Chess and Checkers). A deterministic game has no chance element and in a perfect information game, all information is visible to all players. However, many real-world scenarios with competing agents are stochastic (non-deterministic) with...
متن کاملRegret Minimization in Multiplayer Extensive Games
The counterfactual regret minimization (CFR) algorithm is state-of-the-art for computing strategies in large games and other sequential decisionmaking problems. Little is known, however, about CFR in games with more than 2 players. This extended abstract outlines research towards a better understanding of CFR in multiplayer games and new procedures for computing even stronger multiplayer strate...
متن کاملMonte Carlo Sampling for Regret Minimization in Extensive Games
Sequential decision-making with multiple agents and imperfect information is commonly modeled as an extensive game. One efficient method for computing Nash equilibria in large, zero-sum, imperfect information games is counterfactual regret minimization (CFR). In the domain of poker, CFR has proven effective, particularly when using a domain-specific augmentation involving chance outcome samplin...
متن کاملCounterfactual Regret Minimization in Sequential Security Games
Many real world security problems can be modelled as finite zero-sum games with structured sequential strategies and limited interactions between the players. An abstract class of games unifying these models are the normal-form games with sequential strategies (NFGSS). We show that all games from this class can be modelled as well-formed imperfect-recall extensiveform games and consequently can...
متن کاملMulti-Agent Counterfactual Regret Minimization for Partial-Information Collaborative Games
We study the generalization of counterfactual regret minimization (CFR) to partialinformation collaborative games with more than 2 players. For instance, many 4-player card games are structured as 2v2 games, with each player only knowing the contents of their own hand. To study this setting, we propose a multi-agent collaborative version of Kuhn Poker. We observe that a straightforward applicat...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- CoRR
دوره abs/1305.0034 شماره
صفحات -
تاریخ انتشار 2013